
ML-Regression

Linear Regression is used for regression tasks to predict continuous outputs by minimizing the mean squared error. Ridge Regression is a regularized version of linear regression that adds a penalty term to the loss function to prevent overfitting. Logistic Regression is used for binary classification tasks to predict probabilities and classify data points.

Linear Regression

1. Hypothesis

Linear regression models the output y as a linear function of the input features x:

$$h_\theta(x) = \theta^T x = \theta_0 + \theta_1 x_1 + \cdots + \theta_n x_n$$

2. Cost Function

The cost function used is the Mean Squared Error (MSE):

$$J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$

3. Optimization

  • Gradient Descent: $\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$
  • Normal Equation: $\theta = (X^T X)^{-1} X^T y$
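Both optimizers fit in a few lines of NumPy. Below is a minimal sketch, assuming a design matrix `X` whose first column is all ones (the intercept term) and a target vector `y`; the function names are illustrative, not from any particular library:

```python
import numpy as np

def fit_gradient_descent(X, y, alpha=0.1, iters=10_000):
    """Batch gradient descent on the MSE cost J(theta)."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iters):
        residual = X @ theta - y                 # h_theta(x^(i)) - y^(i)
        theta -= alpha * (X.T @ residual) / m    # simultaneous update of all theta_j
    return theta

def fit_normal_equation(X, y):
    """Closed form: theta = (X^T X)^{-1} X^T y, via solve() rather than an explicit inverse."""
    return np.linalg.solve(X.T @ X, X.T @ y)

# Toy data: y ~ 1 + 2x plus noise; both methods should recover roughly [1, 2].
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 100)
X = np.column_stack([np.ones_like(x), x])
y = 1 + 2 * x + rng.normal(0, 0.05, 100)
print(fit_normal_equation(X, y))
print(fit_gradient_descent(X, y))
```

The normal equation is exact but costs a matrix solve in the number of features; gradient descent trades that for an iteration count and a step size $\alpha$ that must be tuned.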

Logistic Regression

1. Hypothesis

Logistic regression maps the linear function to a probability using the sigmoid function:

$$h_\theta(x) = g(\theta^T x), \qquad g(z) = \frac{1}{1 + e^{-z}}$$
  • Prediction Rule:
    • Predict $y = 1$ if $h_\theta(x) \ge 0.5$.
    • Predict $y = 0$ if $h_\theta(x) < 0.5$.

2. Cost Function

The cost function for logistic regression is:

$$J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log\left(h_\theta(x^{(i)})\right) + \left(1 - y^{(i)}\right) \log\left(1 - h_\theta(x^{(i)})\right) \right]$$

3. Optimization

  • Gradient Descent: $\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$
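The update has the same form as linear regression, only with the sigmoid inside the hypothesis. A minimal NumPy sketch under the same assumptions as before (bias column in `X`, labels `y` in {0, 1}; function names are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, alpha=0.1, iters=10_000):
    """Batch gradient descent on the cross-entropy cost J(theta)."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iters):
        h = sigmoid(X @ theta)                # h_theta(x^(i)) for every example
        theta -= alpha * (X.T @ (h - y)) / m  # identical form to the linear case
    return theta

def predict(X, theta):
    """Apply the 0.5 threshold: y = 1 iff h_theta(x) >= 0.5."""
    return (sigmoid(X @ theta) >= 0.5).astype(int)
```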

4. Sigmoid Function Properties

  • $g(z)$ maps inputs to the open interval $(0, 1)$, so its output can be interpreted as a probability.
  • Derivative: $g'(z) = g(z)\left(1 - g(z)\right)$
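The derivative identity is easy to sanity-check numerically with a central finite difference (a quick check, not a proof):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

z = np.linspace(-5, 5, 11)
eps = 1e-6
numeric = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)  # central difference
analytic = sigmoid(z) * (1 - sigmoid(z))                     # g(z) * (1 - g(z))
print(np.max(np.abs(numeric - analytic)))                    # tiny, on the order of 1e-11
```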

Ridge Regression

Ridge regression is a regularized variant of linear regression: it adds an L2 penalty term to the loss function to address multicollinearity and overfitting.

Loss Function

Ridge regression minimizes the following loss function:

$$J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2 + \lambda \sum_{j=1}^{n} \theta_j^2$$
  • $\lambda$: the regularization parameter, controlling the strength of regularization.
    • $\lambda = 0$: equivalent to ordinary linear regression.
    • $\lambda \to \infty$: all $\theta_j \to 0$.
  • $\sum_{j=1}^{n} \theta_j^2$: the L2 penalty term, which constrains the size of the parameters to prevent overfitting.

Ridge regression has a closed-form solution, obtained by modifying the normal equation of ordinary linear regression:

$$\theta = (X^T X + \lambda I)^{-1} X^T y$$
  • $X$: the design matrix.
  • $I$: the identity matrix.
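A minimal NumPy sketch of this closed form. Note that, exactly as in the formula above, it penalizes every coefficient including the intercept; libraries such as scikit-learn typically leave the intercept unpenalized:

```python
import numpy as np

def ridge_closed_form(X, y, lam=1.0):
    """theta = (X^T X + lambda * I)^{-1} X^T y."""
    n = X.shape[1]
    A = X.T @ X + lam * np.eye(n)       # lambda * I keeps A invertible and well-conditioned
    return np.linalg.solve(A, X.T @ y)  # solve() is preferred over an explicit inverse
```

Adding $\lambda I$ also guarantees the matrix is invertible even when $X^T X$ is singular, which is precisely how ridge regression copes with multicollinearity.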

Gradient-descent solution: the loss function can also be minimized by gradient descent:

$$\theta_j := \theta_j - \alpha \left( \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)} + 2\lambda \theta_j \right)$$
  • The regularization term $2\lambda \theta_j$ is the penalty on $\theta_j$, shrinking it toward zero at every step.
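The corresponding update in NumPy, matching the $2\lambda\theta_j$ term above (a sketch; whether to scale the penalty by $1/m$ is a convention choice that varies between texts):

```python
import numpy as np

def ridge_gradient_descent(X, y, lam=0.1, alpha=0.01, iters=5000):
    """Gradient descent on the ridge loss J(theta) above."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iters):
        grad = (X.T @ (X @ theta - y)) / m + 2 * lam * theta  # data term + 2*lambda*theta
        theta -= alpha * grad
    return theta
```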

Logistic Regression (Detailed Derivation)

$$h_\theta(x) = g(\theta^T x), \qquad g(z) = \frac{1}{1 + e^{-z}}$$

Prediction rule:

  • Predict $y = 1$ if $h_\theta(x) \ge 0.5$, i.e. $\theta^T x \ge 0$.
  • Predict $y = 0$ if $h_\theta(x) < 0.5$, i.e. $\theta^T x < 0$.

The decision boundary is the set of points where $\theta^T x = 0$.
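Because $g(z) \ge 0.5$ exactly when $z \ge 0$, classifying a point only needs the sign of $\theta^T x$; the sigmoid never has to be evaluated. A tiny illustration with hypothetical parameters:

```python
import numpy as np

theta = np.array([-3.0, 1.0, 1.0])   # hypothetical parameters [theta_0, theta_1, theta_2]
X = np.array([[1.0, 1.0, 1.0],       # theta^T x = -1 < 0  -> predict y = 0
              [1.0, 2.0, 2.0]])      # theta^T x = +1 >= 0 -> predict y = 1
print((X @ theta >= 0).astype(int))  # [0 1]; the boundary is the line x1 + x2 = 3
```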

Cost Function and Gradient

The cost function for logistic regression is defined as:

$$\mathrm{Cost}(h_\theta(x), y) = \begin{cases} -\log(h_\theta(x)) & \text{if } y = 1 \\ -\log(1 - h_\theta(x)) & \text{if } y = 0 \end{cases}$$

When $y = 1$:

  • $\mathrm{Cost} = 0$ if $h_\theta(x) = 1$.
  • As $h_\theta(x) \to 0$, $\mathrm{Cost} \to \infty$.
  • When $h_\theta(x) = 0$, the model predicts $P(y = 1 \mid x; \theta) = 0$ even though $y = 1$, so the cost blows up: confident wrong predictions are penalized heavily.

When $y = 0$:

  • $\mathrm{Cost} = 0$ if $h_\theta(x) = 0$.
  • $\mathrm{Cost} \to \infty$ as $h_\theta(x) \to 1$.
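A quick numeric illustration of both cases, using a few hypothetical predicted probabilities:

```python
import numpy as np

for h in (0.99, 0.5, 0.01):               # hypothetical values of h_theta(x)
    print(h, -np.log(h), -np.log(1 - h))  # cost if y = 1, cost if y = 0
# h = 0.99: cost ~0.01 when y = 1, ~4.6 when y = 0
# h = 0.01: the reverse; confident wrong answers cost the most
```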

Simplification of Logistic Regression Cost Function

The overall cost function for logistic regression is defined as:

$$J(\theta) = \frac{1}{m} \sum_{i=1}^{m} \mathrm{Cost}\left(h_\theta(x^{(i)}), y^{(i)}\right)$$

Note that $y$ is always 0 or 1, so exactly one case of the piecewise cost above applies to each example. The two cases can therefore be combined into a single expression:

$$J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log\left(h_\theta(x^{(i)})\right) + \left(1 - y^{(i)}\right) \log\left(1 - h_\theta(x^{(i)})\right) \right]$$
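The combination works because one of the two terms always vanishes. A two-line sanity check with a hypothetical prediction $h = 0.7$:

```python
import numpy as np

def compact_cost(h, y):
    """-y*log(h) - (1 - y)*log(1 - h), the single-expression form."""
    return -y * np.log(h) - (1 - y) * np.log(1 - h)

h = 0.7
print(compact_cost(h, 1), -np.log(h))      # equal: the (1 - y) term vanishes
print(compact_cost(h, 0), -np.log(1 - h))  # equal: the y term vanishes
```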

To fit the parameters $\theta$, solve $\min_\theta J(\theta)$.

To make a prediction given a new $x$, output:

$$h_\theta(x) = \frac{1}{1 + e^{-\theta^T x}}$$


Gradient Descent Algorithm:

Repeat:

$$\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta) = \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$$

(Simultaneously update all θj).
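"Simultaneously" matters: every $\frac{\partial}{\partial \theta_j} J(\theta)$ must be evaluated at the current $\theta$ before any component is overwritten. A vectorized sketch gets this right for free, whereas updating $\theta_j$ in place inside a loop over $j$ would not:

```python
import numpy as np

def gd_step(theta, X, y, alpha):
    """One simultaneous gradient-descent step for logistic regression."""
    h = 1.0 / (1.0 + np.exp(-(X @ theta)))
    grad = (X.T @ (h - y)) / X.shape[0]  # full gradient computed before any theta_j changes
    return theta - alpha * grad          # every component updated at once
```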

The gradient takes this simple form because of the sigmoid derivative $g'(z) = g(z)\left(1 - g(z)\right)$: differentiating $J(\theta)$ with $h_\theta(x) = \frac{1}{1 + e^{-\theta^T x}}$, the $g'$ factors cancel, leaving the same update rule as in linear regression.